feat(index): share IVF partition scans across batch vector queries by sezruby · Pull Request #2 · sezruby/lance

sezruby · 2026-06-15T23:08:08Z

Fork-internal review PR. Targets sezruby:main so I can review before opening against lance-format/lance. Implements #6822.

Summary

Batch vector search (#6821, PR lance-format#6828) made indexed multi-query search work by looping the full single-query plan once per query vector (re-opening the index and rebuilding the prefilter each time) and unioning the results. This PR makes the indexed/ANN path share index-level state across the batch: it reads each IVF partition's storage once and scores every query that probes it.

Approach

VectorIndex trait (rust/lance-index/src/vector.rs): add supports_batch_partition_search() and search_partitions_batch(...), both defaulted (default returns a not_supported error), so non-IVF indices remain explicitly unsupported.
IVFIndex (rust/lance/src/index/vector/ivf/v2.rs): implement the batch search for flat-style sub-indices (IVF_FLAT/PQ/SQ/RQ, i.e. supports_global_topk_heap()). It inverts the per-query partition lists, loads each distinct partition once, and accumulates one top-k heap per query against the loaded storage, reusing accumulate_prepared_partition_search / global_heap_to_batch. The prefilter is built once and shared across all queries.
ANNIvfBatchExec (rust/lance/src/io/exec/knn.rs): a single exec node that ranks every query against the centroids, runs the shared-scan batch search per delta, merges per-query top-k across deltas, and emits {query_index, _distance, _rowid} sorted by (query_index, _distance, _rowid).
Routing (rust/lance/src/dataset/scanner.rs): batch_indexed_vector_search takes the new fast path when every segment is an IVF flat-style index and the scan is fully indexed (or fast_search); otherwise it falls back to the existing per-query loop (HNSW, refine_factor, mixed indexed/unindexed). No behavior regression.
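The shared-scan idea above can be sketched in a few lines. This is a toy illustration, not the code in this PR: `BatchSearch`, `ToyIvfFlat`, `Hit`, the in-memory partitions, the `String` error type, and the plain L2 distance are all hypothetical stand-ins, and it assumes the defaulted `supports_batch_partition_search()` returns `false`. It shows the two pieces: defaulted trait methods that leave other index types unsupported, and an inverted partition-to-queries map so each partition is visited once while one bounded top-k heap is kept per query.

```rust
use std::cmp::Ordering;
use std::collections::{BTreeMap, BinaryHeap};

/// One candidate in a per-query top-k heap, ordered by (distance, row_id).
#[derive(Debug, Clone, Copy)]
pub struct Hit {
    pub distance: f32,
    pub row_id: u64,
}

impl Ord for Hit {
    fn cmp(&self, other: &Self) -> Ordering {
        self.distance
            .total_cmp(&other.distance)
            .then(self.row_id.cmp(&other.row_id))
    }
}
impl PartialOrd for Hit {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl PartialEq for Hit {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}
impl Eq for Hit {}

/// Hypothetical stand-in for the two defaulted trait methods.
pub trait BatchSearch {
    /// Assumed default: indices must opt in explicitly.
    fn supports_batch_partition_search(&self) -> bool {
        false
    }

    /// Default body reports "not supported" instead of silently degrading.
    fn search_partitions_batch(
        &self,
        _queries: &[Vec<f32>],
        _probes: &[Vec<usize>],
        _k: usize,
    ) -> Result<Vec<Vec<Hit>>, String> {
        Err("batch partition search is not supported by this index".to_string())
    }
}

/// Toy flat IVF: partitions[p] holds the (row_id, vector) rows of partition p.
pub struct ToyIvfFlat {
    pub partitions: Vec<Vec<(u64, Vec<f32>)>>,
}

/// Squared L2 distance between two equal-length vectors.
fn l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

impl BatchSearch for ToyIvfFlat {
    fn supports_batch_partition_search(&self) -> bool {
        true
    }

    fn search_partitions_batch(
        &self,
        queries: &[Vec<f32>],
        probes: &[Vec<usize>],
        k: usize,
    ) -> Result<Vec<Vec<Hit>>, String> {
        if queries.len() != probes.len() {
            return Err("one probe list is required per query".to_string());
        }
        // Invert the per-query partition lists: partition id -> probing queries.
        let mut by_partition: BTreeMap<usize, Vec<usize>> = BTreeMap::new();
        for (q, parts) in probes.iter().enumerate() {
            for &p in parts {
                let qs = by_partition.entry(p).or_default();
                if !qs.contains(&q) {
                    qs.push(q);
                }
            }
        }
        let mut heaps: Vec<BinaryHeap<Hit>> =
            (0..queries.len()).map(|_| BinaryHeap::new()).collect();
        // Visit each distinct partition once and score every query that probes it.
        for (p, query_ids) in &by_partition {
            let rows = self
                .partitions
                .get(*p)
                .ok_or_else(|| format!("unknown partition {p}"))?;
            for (row_id, vector) in rows {
                for &q in query_ids {
                    let heap = &mut heaps[q];
                    heap.push(Hit { distance: l2(&queries[q], vector), row_id: *row_id });
                    if heap.len() > k {
                        heap.pop(); // max-heap: evict the current worst hit
                    }
                }
            }
        }
        // Ascending (distance, row_id) per query.
        Ok(heaps.into_iter().map(BinaryHeap::into_sorted_vec).collect())
    }
}
```

In this sketch the cost of touching a partition is paid once per distinct partition rather than once per (query, partition) pair, which is where the saving comes from when several queries probe the same partitions.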

Known limitations (documented, deferred)

Per-query nprobes are honored statically (minimum_nprobes); the adaptive late-search expansion of the single-query path is not applied. Recall therefore matches repeated single-query search at fixed nprobes (the common case).
HNSW batch sharing, batch refine_factor/reranking, and batch + unindexed-fragment combine fall back to the per-query loop (follow-ups).
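The fast-path versus fallback decision described above can be read as a single predicate. The sketch below is an assumption-laden illustration, not the logic in `scanner.rs`: `SubIndex`, `BatchPlan`, `BatchRequest`, and `choose_plan` are hypothetical names, and the real routing may check more conditions.

```rust
/// Hypothetical tag for the sub-index type of one index segment.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SubIndex {
    Flat,
    Pq,
    Sq,
    Rq,
    Hnsw,
}

/// Which plan a batch indexed search would use.
#[derive(Debug, PartialEq, Eq)]
pub enum BatchPlan {
    /// Shared partition scan (the new fast path).
    SharedScan,
    /// Existing behavior: one full single-query plan per query vector.
    PerQueryLoop,
}

/// Hypothetical summary of the inputs the routing decision depends on.
pub struct BatchRequest<'a> {
    pub segments: &'a [SubIndex],
    pub fully_indexed: bool,
    pub fast_search: bool,
    pub refine_factor: Option<u32>,
}

pub fn choose_plan(req: &BatchRequest<'_>) -> BatchPlan {
    // Every segment must be a flat-style IVF sub-index (no HNSW).
    let all_flat_style = !req.segments.is_empty()
        && req.segments.iter().all(|s| !matches!(s, SubIndex::Hnsw));
    // No unindexed fragments need to be combined in.
    let no_unindexed_work = req.fully_indexed || req.fast_search;
    // Refine/rerank is not batched, so it forces the fallback.
    if all_flat_style && no_unindexed_work && req.refine_factor.is_none() {
        BatchPlan::SharedScan
    } else {
        BatchPlan::PerQueryLoop
    }
}
```

Every case that fails the predicate lands on the pre-existing per-query loop, which is why the fallback cannot change results for those inputs.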

Test plan

cargo test -p lance --lib test_batch_knn — incl. updated test_batch_knn_indexed (asserts the ANNIvfBatch plan and that batch results equal repeated single-query indexed search) and new test_batch_knn_indexed_refine_falls_back.
cargo test -p lance --lib dataset::scanner::test::test_knn (29) and index::vector::ivf::v2 (88) — no regressions.
cargo fmt --all && cargo clippy -p lance -p lance-index --tests --benches -- -D warnings.
Python: uv run pytest python/tests/test_vector_index.py -k batch → 6 passed, incl. new test_batch_indexed_query_matches_repeated_single_queries (3-query + single-query). ruff check / ruff format --check clean.
Added a batch-vs-repeated-single ANN benchmark (benchmarks/test_search.py). Note: the shared datasets benchmark fixture is currently broken in this checkout (pre-existing use_legacy_format deprecation → error, also fails test_ann_no_refine), so I validated performance with a standalone script instead:

50k rows, dim 128, IVF_PQ (64 partitions), m=32 queries, k=10, nprobes=8:

| | median | QPS |
|---|---|---|
| repeated single-query | 14.04 ms | 2279 |
| batch (shared scan) | 5.66 ms | 5655 |
| | | 2.48× speedup |
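As a sanity check on these figures, the sketch below recomputes QPS and speedup from the medians. It assumes QPS is the batch size (m=32) divided by the median latency of one batch; the helper names are hypothetical, and the standalone script may derive QPS differently, which would explain small rounding differences.

```rust
/// Queries per second for a batch of `batch_size` queries that takes
/// `median_ms` milliseconds (assumed definition, see the note above).
pub fn queries_per_second(batch_size: u32, median_ms: f64) -> f64 {
    batch_size as f64 / (median_ms / 1000.0)
}

/// Ratio of baseline latency to candidate latency.
pub fn speedup(baseline_ms: f64, candidate_ms: f64) -> f64 {
    baseline_ms / candidate_ms
}
```

With these definitions, `queries_per_second(32, 14.04)` is about 2279.2, `queries_per_second(32, 5.66)` is about 5653.7 (reported as 5655, presumably computed from an unrounded median), and `speedup(14.04, 5.66)` is about 2.48.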

Closes lance-format#6822

Extend batch vector search (lance-format#6821) to the indexed/ANN path so a single multi-query request reads each IVF partition's storage once and scores every query that probes it, instead of re-running a full single-query plan per vector and unioning the results.

- Add `VectorIndex::search_partitions_batch` + `supports_batch_partition_search` (defaulted so non-IVF indices stay explicitly unsupported).
- Implement them for `IVFIndex` with a flat-style sub-index (IVF_FLAT/PQ/SQ/RQ): load each distinct partition once and accumulate one top-k heap per query, sharing the prefilter across the whole batch.
- Add `ANNIvfBatchExec`, which ranks every query against the centroids, runs the shared-scan batch search, and emits `query_index`-tagged results; route to it from `Scanner::batch_indexed_vector_search` when the index family supports it.
- Fall back to the existing per-query loop for HNSW, `refine_factor`, and mixed indexed/unindexed scans, so behavior never regresses.

Per-query nprobes are honored statically (no adaptive late expansion), so recall matches repeated single-query search at fixed nprobes.

Closes lance-format#6822

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-actions bot added the enhancement, A-index, and A-python labels on Jun 15, 2026.

sezruby wants to merge 1 commit into main from knn-batch-6822
